OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Extraction of character areas from digital camera based color document images and OCR system

Internal identifier: 001398 (Main/Exploration); previous: 001397; next: 001399

Authors: Y. K. Chung [South Korea]; S. Y. Chi [South Korea]; K. S. Bae [South Korea]; K. K. Kim [South Korea]; D. Jang [South Korea]; K. C. Kim [South Korea]; Y. W. Choi [South Korea]

Source: Proceedings of SPIE, the International Society for Optical Engineering (ISSN 0277-786X), 2005

RBID : Pascal:06-0282232

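French descriptors: Traitement image; Image couleur; Traitement document; Reconnaissance caractère; Extraction forme; Segmentation; Extraction caractéristique

English descriptors: Character recognition; Color image; Document processing; Feature extraction; Image processing; Pattern extraction; Segmentation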

Abstract

When document images are captured with digital cameras, many imaging problems must be solved before characters can be extracted reliably. Variation in illumination intensity strongly affects color values. A simple colored document image can be converted to a monochrome image by a traditional method and then binarized. However, this approach is not stable under varying illumination, because colors are sensitive to illumination changes. First, for narrowly distributed colors, the conversion does not work well. Second, when there are more than two colors, it is not easy to determine which color belongs to the characters and which belong to the background. This paper discusses a method for extracting characters from a colored document image using a color processing algorithm based on the characteristics of color features. Variations in intensity and color distribution are used to classify character areas and background areas. A document image is segmented into several color groups, and similar color groups are merged. In the final step, only two color groups remain: one for the characters and one for the background. The character areas extracted from the document images are fed into an optical character recognition system. This method solves a color problem that arises with traditional scanner-based OCR systems. This paper also describes the OCR system used to convert the characters of a colored document image. The method works on real-world colored document images from cellular phones and digital cameras.
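
The segment-then-merge idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the coarse color binning, the squared RGB distance, the count-weighted merge, and the rule that the smaller final group is the text are all assumptions made for illustration, and the names `quantize`, `dist2`, and `split_character_background` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def quantize(pixel, step=64):
    """Map an (r, g, b) pixel to the center of a coarse color bin."""
    return tuple((c // step) * step + step // 2 for c in pixel)


def dist2(a, b):
    """Squared Euclidean distance between two colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split_character_background(pixels, step=64):
    """Return one boolean per pixel: True if it falls in the 'character' group.

    Assumption (not from the paper): characters cover fewer pixels than
    the background, so the smaller of the two final groups is the text.
    """
    if not pixels:
        return []
    # Step 1: segment into color groups (center, pixel count, member bins).
    bins = Counter(quantize(p, step) for p in pixels)
    groups = [(center, count, {center}) for center, count in bins.items()]
    # Step 2: merge the two most similar groups until only two are left.
    while len(groups) > 2:
        i, j = min(
            combinations(range(len(groups)), 2),
            key=lambda ij: dist2(groups[ij[0]][0], groups[ij[1]][0]),
        )
        (ca, na, ba), (cb, nb, bb) = groups[i], groups[j]
        center = tuple((x * na + y * nb) / (na + nb) for x, y in zip(ca, cb))
        rest = [g for k, g in enumerate(groups) if k not in (i, j)]
        groups = rest + [(center, na + nb, ba | bb)]
    # Step 3: label the smaller remaining group as character pixels.
    character_bins = min(groups, key=lambda g: g[1])[2]
    return [quantize(p, step) in character_bins for p in pixels]


if __name__ == "__main__":
    # Light paper under two illumination levels, plus dark blue ink.
    page = [(250, 240, 200)] * 6 + [(230, 225, 180)] * 4
    ink = [(20, 30, 120)] * 2 + [(40, 50, 140)]
    print(split_character_background(page + ink))
```

The resulting mask plays the role of the binarized image that would be passed to an OCR engine. A real implementation would likely work in a color space less sensitive to illumination than RGB and would use the intensity variation mentioned in the abstract to decide which group is the text.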


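Affiliations: South Korea (Intelligent Robot Research Division, ETRI, Daejeon; 134 Shinchon-dong, Seodamoon-gu, Seoul; Dept. of Computer Science, Sookmyung Women's University, Seoul)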



The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Extraction of character areas from digital camera based color document images and OCR system</title>
<author>
<name sortKey="Chung, Y K" sort="Chung, Y K" uniqKey="Chung Y" first="Y. K." last="Chung">Y. K. Chung</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Intelligent Robot Research Division, ETRI, Korea 161 Gajeong-dong</s1>
<s2>Yuseong-gu, Daejeon, 305-350</s2>
<s3>KOR</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>Corée du Sud</country>
<wicri:noRegion>Yuseong-gu, Daejeon, 305-350</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Chi, S Y" sort="Chi, S Y" uniqKey="Chi S" first="S. Y." last="Chi">S. Y. Chi</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Intelligent Robot Research Division, ETRI, Korea 161 Gajeong-dong</s1>
<s2>Yuseong-gu, Daejeon, 305-350</s2>
<s3>KOR</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>Corée du Sud</country>
<wicri:noRegion>Yuseong-gu, Daejeon, 305-350</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Bae, K S" sort="Bae, K S" uniqKey="Bae K" first="K. S." last="Bae">K. S. Bae</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Intelligent Robot Research Division, ETRI, Korea 161 Gajeong-dong</s1>
<s2>Yuseong-gu, Daejeon, 305-350</s2>
<s3>KOR</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>Corée du Sud</country>
<wicri:noRegion>Yuseong-gu, Daejeon, 305-350</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Kim, K K" sort="Kim, K K" uniqKey="Kim K" first="K. K." last="Kim">K. K. Kim</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Intelligent Robot Research Division, ETRI, Korea 161 Gajeong-dong</s1>
<s2>Yuseong-gu, Daejeon, 305-350</s2>
<s3>KOR</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>Corée du Sud</country>
<wicri:noRegion>Yuseong-gu, Daejeon, 305-350</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Jang, D" sort="Jang, D" uniqKey="Jang D" first="D." last="Jang">D. Jang</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Intelligent Robot Research Division, ETRI, Korea 161 Gajeong-dong</s1>
<s2>Yuseong-gu, Daejeon, 305-350</s2>
<s3>KOR</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>Corée du Sud</country>
<wicri:noRegion>Yuseong-gu, Daejeon, 305-350</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Kim, K C" sort="Kim, K C" uniqKey="Kim K" first="K. C." last="Kim">K. C. Kim</name>
<affiliation wicri:level="1">
<inist:fA14 i1="02">
<s2>134 Shinchon-dong, Seodamoon-gu, Seoul, 120-749</s2>
<s3>KOR</s3>
<sZ>6 aut.</sZ>
</inist:fA14>
<country>Corée du Sud</country>
<wicri:noRegion>134 Shinchon-dong, Seodamoon-gu, Seoul, 120-749</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Choi, Y W" sort="Choi, Y W" uniqKey="Choi Y" first="Y. W." last="Choi">Y. W. Choi</name>
<affiliation wicri:level="3">
<inist:fA14 i1="03">
<s1>Dept. of Computer Science, Sookmyung Women's University, Korea 53-12 Chungpa-dong</s1>
<s2>Youngsan-gu, Seoul</s2>
<s3>KOR</s3>
<sZ>7 aut.</sZ>
</inist:fA14>
<country>Corée du Sud</country>
<placeName>
<settlement type="city">Séoul</settlement>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">06-0282232</idno>
<date when="2005">2005</date>
<idno type="stanalyst">PASCAL 06-0282232 INIST</idno>
<idno type="RBID">Pascal:06-0282232</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000387</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000399</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000408</idno>
<idno type="wicri:doubleKey">0277-786X:2005:Chung Y:extraction:of:character</idno>
<idno type="wicri:Area/Main/Merge">001438</idno>
<idno type="wicri:Area/Main/Curation">001398</idno>
<idno type="wicri:Area/Main/Exploration">001398</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Extraction of character areas from digital camera based color document images and OCR system</title>
<author>
<name sortKey="Chung, Y K" sort="Chung, Y K" uniqKey="Chung Y" first="Y. K." last="Chung">Y. K. Chung</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Intelligent Robot Research Division, ETRI, Korea 161 Gajeong-dong</s1>
<s2>Yuseong-gu, Daejeon, 305-350</s2>
<s3>KOR</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>Corée du Sud</country>
<wicri:noRegion>Yuseong-gu, Daejeon, 305-350</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Chi, S Y" sort="Chi, S Y" uniqKey="Chi S" first="S. Y." last="Chi">S. Y. Chi</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Intelligent Robot Research Division, ETRI, Korea 161 Gajeong-dong</s1>
<s2>Yuseong-gu, Daejeon, 305-350</s2>
<s3>KOR</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>Corée du Sud</country>
<wicri:noRegion>Yuseong-gu, Daejeon, 305-350</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Bae, K S" sort="Bae, K S" uniqKey="Bae K" first="K. S." last="Bae">K. S. Bae</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Intelligent Robot Research Division, ETRI, Korea 161 Gajeong-dong</s1>
<s2>Yuseong-gu, Daejeon, 305-350</s2>
<s3>KOR</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>Corée du Sud</country>
<wicri:noRegion>Yuseong-gu, Daejeon, 305-350</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Kim, K K" sort="Kim, K K" uniqKey="Kim K" first="K. K." last="Kim">K. K. Kim</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Intelligent Robot Research Division, ETRI, Korea 161 Gajeong-dong</s1>
<s2>Yuseong-gu, Daejeon, 305-350</s2>
<s3>KOR</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>Corée du Sud</country>
<wicri:noRegion>Yuseong-gu, Daejeon, 305-350</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Jang, D" sort="Jang, D" uniqKey="Jang D" first="D." last="Jang">D. Jang</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Intelligent Robot Research Division, ETRI, Korea 161 Gajeong-dong</s1>
<s2>Yuseong-gu, Daejeon, 305-350</s2>
<s3>KOR</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>Corée du Sud</country>
<wicri:noRegion>Yuseong-gu, Daejeon, 305-350</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Kim, K C" sort="Kim, K C" uniqKey="Kim K" first="K. C." last="Kim">K. C. Kim</name>
<affiliation wicri:level="1">
<inist:fA14 i1="02">
<s2>134 Shinchon-dong, Seodamoon-gu, Seoul, 120-749</s2>
<s3>KOR</s3>
<sZ>6 aut.</sZ>
</inist:fA14>
<country>Corée du Sud</country>
<wicri:noRegion>134 Shinchon-dong, Seodamoon-gu, Seoul, 120-749</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Choi, Y W" sort="Choi, Y W" uniqKey="Choi Y" first="Y. W." last="Choi">Y. W. Choi</name>
<affiliation wicri:level="3">
<inist:fA14 i1="03">
<s1>Dept. of Computer Science, Sookmyung Women's University, Korea 53-12 Chungpa-dong</s1>
<s2>Youngsan-gu, Seoul</s2>
<s3>KOR</s3>
<sZ>7 aut.</sZ>
</inist:fA14>
<country>Corée du Sud</country>
<placeName>
<settlement type="city">Séoul</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Proceedings of SPIE, the International Society for Optical Engineering</title>
<idno type="ISSN">0277-786X</idno>
<imprint>
<date when="2005">2005</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Proceedings of SPIE, the International Society for Optical Engineering</title>
<idno type="ISSN">0277-786X</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Character recognition</term>
<term>Color image</term>
<term>Document processing</term>
<term>Feature extraction</term>
<term>Image processing</term>
<term>Pattern extraction</term>
<term>Segmentation</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Traitement image</term>
<term>Image couleur</term>
<term>Traitement document</term>
<term>Reconnaissance caractère</term>
<term>Extraction forme</term>
<term>Segmentation</term>
<term>Extraction caractéristique</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">When document images are obtained from digital cameras, many imaging problems have to be solved for better extraction of characters from the images. Variation of illumination intensity sensitively affects to color values. A simple colored document image could be converted to a monochrome image by a traditional method and then a binarization algorithm is used. But this method is not stably working to the variation of illumination because sensitivity of colors to variation of illumination. For narrowly distributed colors, the conversion is not working well. Secondly, in case that the number of colors is more than two, it is not easy to figure out which color is for character and which others are for background. This paper discusses about an extraction method from a colored document image using a color process algorithm based on characteristics of color features. Variation of intensities and color distribution are used to classify character areas and background areas. A document image is segmented into several color groups and similar color groups are merged. In final step, only two colored groups are left for the character and background. The extracted character areas from the document images are entered into optical character recognition system. This method solves a color problem, which comes from traditional scanner based OCR systems. This paper also describes the OCR system for character conversion of a colored document image. Our method is working for the colored document images of cellular phones and digital cameras in real world.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Corée du Sud</li>
</country>
<settlement>
<li>Séoul</li>
</settlement>
</list>
<tree>
<country name="Corée du Sud">
<noRegion>
<name sortKey="Chung, Y K" sort="Chung, Y K" uniqKey="Chung Y" first="Y. K." last="Chung">Y. K. Chung</name>
</noRegion>
<name sortKey="Bae, K S" sort="Bae, K S" uniqKey="Bae K" first="K. S." last="Bae">K. S. Bae</name>
<name sortKey="Chi, S Y" sort="Chi, S Y" uniqKey="Chi S" first="S. Y." last="Chi">S. Y. Chi</name>
<name sortKey="Choi, Y W" sort="Choi, Y W" uniqKey="Choi Y" first="Y. W." last="Choi">Y. W. Choi</name>
<name sortKey="Jang, D" sort="Jang, D" uniqKey="Jang D" first="D." last="Jang">D. Jang</name>
<name sortKey="Kim, K C" sort="Kim, K C" uniqKey="Kim K" first="K. C." last="Kim">K. C. Kim</name>
<name sortKey="Kim, K K" sort="Kim, K K" uniqKey="Kim K" first="K. K." last="Kim">K. K. Kim</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001398 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001398 | SxmlIndent | more
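
Once the record has been printed by the commands above, it can also be read outside Dilib. The sketch below uses Python's standard `xml.etree.ElementTree` and is only an illustration, not part of Dilib. It assumes the record starts with a bare `<record>` tag and, because the record uses the `wicri:` and `inist:` prefixes without declaring them, it injects placeholder namespace URIs (made up for this example) so that the parser accepts the text. The helper names `parse_record` and `authors_and_keywords` are hypothetical, and the sample is a shortened copy of the record shown earlier.

```python
import xml.etree.ElementTree as ET

# Placeholder URIs (assumption): the record does not declare these prefixes,
# and a strict XML parser rejects undeclared prefixes.
PLACEHOLDER_NS = {"wicri": "urn:example:wicri", "inist": "urn:example:inist"}

SAMPLE = """<record>
<TEI>
<teiHeader>
<fileDesc>
<sourceDesc>
<biblStruct>
<analytic>
<author>
<name first="Y. K." last="Chung">Y. K. Chung</name>
<affiliation wicri:level="1"><country>Corée du Sud</country></affiliation>
</author>
<author>
<name first="S. Y." last="Chi">S. Y. Chi</name>
<affiliation wicri:level="1"><country>Corée du Sud</country></affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Character recognition</term>
<term>Color image</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
</TEI>
</record>"""


def parse_record(xml_text):
    """Parse a record after declaring the placeholder namespaces on its root."""
    decl = " ".join(f'xmlns:{p}="{u}"' for p, u in PLACEHOLDER_NS.items())
    patched = xml_text.replace("<record>", f"<record {decl}>", 1)
    return ET.fromstring(patched)


def authors_and_keywords(record):
    """Return (author names, English keywords) from a parsed record."""
    analytic = record.find(".//analytic")
    authors = [a.findtext("name") for a in analytic.findall("author")]
    terms = record.findall(".//keywords[@scheme='KwdEn']/term")
    return authors, [t.text for t in terms]


if __name__ == "__main__":
    names, keywords = authors_and_keywords(parse_record(SAMPLE))
    print(names)
    print(keywords)
```

Reading authors from `analytic` rather than from `titleStmt` avoids listing each name twice, since the full record repeats the author list in both places.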

To link to this page from the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:06-0282232
   |texte=   Extraction of character areas from digital camera based color document images and OCR system
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024